Facets

PH345: Winter 2025

Phil Boonstra

Horse in Motion

Is there a moment where a galloping horse has all hooves off the ground?

Eadweard Muybridge, 1878. Public Domain

Chromosomes

Schematic representation of late-prophase chromosomes (1000-band stage) of man, chimpanzee, gorilla, and orangutan, arranged from left to right, respectively, to better visualize homology between the chromosomes of the great apes and the human complement.

Figure 2, Yunis and Prakash (1982)

What is this called?

[S]lice the data into parts according to one or more data dimensions, visualize each data slice separately, and then arrange the individual visualizations into a grid (Ch21, Wilke, 2019).

Called ‘small multiples’ (Tufte, 1991), ‘trellis plots’ (Becker et al., 1996), or ‘facet plots’ (Wickham, 2016).

Why do this?

Introduces a third dimension (first two dimensions being x and y). Essentially another aesthetic.

Tufte’s principles of graphical excellence:

  • Present many numbers in small space

  • Make large data sets coherent

What are guidelines?

Use when you want to focus audience attention on differences between subsets of the data

Faceting variable must be categorical

Each mini-plot should (normally) have the same structure: common axes, scales, etc.

Hadley Wickham (1979-)

New Zealand statistician, Chief Scientist at Posit PBC

Creator of ggplot2 and the tidyverse

John Chambers Award for Statistical Computing (2006); Fellow of ASA (2015); COPSS Presidents’ Award (2019)

Two flavors of faceting in ggplot2

  1. facet_wrap(): wrap a 1D ribbon of plots into 2D grid
  2. facet_grid(): create a 2D grid of plots

See Chapter 16 of ggplot2 book (https://ggplot2-book.org/facet)
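A minimal sketch of the two flavors side by side. It assumes the mpg data that ships with ggplot2 as a stand-in (it is not one of the course datasets); p_wrap and p_grid are illustrative names.

```r
library(ggplot2)

# mpg ships with ggplot2; used here only as a stand-in dataset
base <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy))

# facet_wrap(): one faceting variable, panels wrapped into a 2D layout
p_wrap <- base + facet_wrap(facets = vars(class), ncol = 4)

# facet_grid(): one variable defines the rows, another the columns
p_grid <- base + facet_grid(rows = vars(drv), cols = vars(cyl))

p_wrap
p_grid
```

facet_grid() draws a panel for every row-by-column combination, including combinations with no data, whereas facet_wrap() only draws panels for levels that occur.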

We’ve already seen facet_wrap()

# Install datasauRus if needed, then load it along with ggplot2
if (!require(datasauRus)) {
  install.packages("datasauRus")
  library(datasauRus)
}
library(ggplot2)
ggplot(datasaurus_dozen) +
  geom_point(aes(x = x, y = y), size = 1) +
  facet_wrap(facets = vars(dataset), ncol = 5) +
  guides(color = "none") +
  theme(text = element_text(size = 20))

facet_grid() doesn’t ‘work’ here

# Install datasauRus if needed, then load it along with ggplot2
if (!require(datasauRus)) {
  install.packages("datasauRus")
  library(datasauRus)
}
library(ggplot2)
ggplot(datasaurus_dozen) +
  geom_point(aes(x = x, y = y), size = 1) +
  facet_grid(rows = vars(dataset)) +
  guides(color = "none") +
  theme(text = element_text(size = 20))

Doesn’t work as columns either

# datasauRus is already loaded from the previous slide
ggplot(datasaurus_dozen) +
  geom_point(aes(x = x, y = y), size = 1) +
  facet_grid(cols = vars(dataset)) +
  guides(color = "none") +
  theme(text = element_text(size = 20))

Malaria

  • Spread by the bite of female Anopheles mosquitos, which are hosts to the malaria parasite

  • Exists in tropical and subtropical regions with inadequate public health infrastructure

  • 263 million cases in 2023; 597,000 deaths

US CDC: https://www.cdc.gov/malaria/data-research/index.html

World Malaria Report, WHO: https://www.who.int/teams/global-malaria-programme/reports/world-malaria-report-2024

Global malaria incidence

WHO collects country-level data on malaria incidence.

Raw data on incidence per 1,000 at-risk persons available at https://data.worldbank.org/indicator/SH.MLR.INCD.P3

Friendlier data available on Canvas (malaria_countries_long.csv)

An unfaceted plot of incidence: Sub-Saharan Africa

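# read_csv(), filter(), %>%, and ggplot() all come from the tidyverse
library(tidyverse)
# I've downloaded the data from Canvas into my project folder:
malaria_countries <- read_csv("malaria_countries_long.csv")
# Keep only the countries in Sub-Saharan Africa
malaria_countries_ssafrica <- malaria_countries %>%
  filter(region == "Sub-Saharan Africa")
# One line per country, distinguished by color
ggplot(malaria_countries_ssafrica) +
  geom_line(aes(x = year, y = incidence, group = country, color = country),
            alpha = 0.75)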

A faceted plot of incidence: Sub-Saharan Africa
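A minimal sketch of a faceted version. The toy data frame below holds made-up values standing in for malaria_countries_ssafrica (same column names: country, year, incidence); the placeholder country names are not real data.

```r
library(ggplot2)

# Made-up values standing in for malaria_countries_ssafrica
toy <- data.frame(
  country = rep(c("Country A", "Country B", "Country C"), each = 3),
  year = rep(c(2000, 2011, 2022), times = 3),
  incidence = c(120, 150, 90, 400, 300, 250, 30, 20, 45)
)

# One panel per country; color is no longer needed to tell countries apart
p <- ggplot(toy) +
  geom_line(aes(x = year, y = incidence)) +
  facet_wrap(facets = vars(country))
p
```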

Drop the ‘strips’

# To drop the strips completely, add this to your code:
theme(strip.text = element_blank())
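As a sketch of where that line goes in a full plot, the example below uses made-up toy values in place of the course data; only the final theme() call is the point.

```r
library(ggplot2)

# Made-up values standing in for malaria_countries_ssafrica
toy <- data.frame(
  country = rep(c("Country A", "Country B", "Country C"), each = 3),
  year = rep(c(2000, 2011, 2022), times = 3),
  incidence = c(120, 150, 90, 400, 300, 250, 30, 20, 45)
)

p <- ggplot(toy) +
  geom_line(aes(x = year, y = incidence)) +
  facet_wrap(facets = vars(country)) +
  # Remove the text in the strip above each panel
  theme(strip.text = element_blank())
p
```

Without strips the panels are unlabeled, so this is usually paired with some other way of identifying each panel.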

Add labels inside

# To add labels to the plots, use a dataset with just one row per country:
malaria_countries_ssafrica %>%
  group_by(country) %>%
  slice(1)
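One possible way to use that one-row-per-country dataset, sketched on made-up toy values: pass it to geom_text() so each panel gets a single label. Pinning the label to the top-left corner with -Inf and Inf is an assumption of this sketch, not necessarily the placement used in the slides.

```r
library(dplyr)
library(ggplot2)

# Made-up values standing in for malaria_countries_ssafrica
toy <- data.frame(
  country = rep(c("Country A", "Country B", "Country C"), each = 3),
  year = rep(c(2000, 2011, 2022), times = 3),
  incidence = c(120, 150, 90, 400, 300, 250, 30, 20, 45)
)

# One row per country, used only to place the label
toy_labels <- toy %>%
  group_by(country) %>%
  slice(1) %>%
  ungroup()

p <- ggplot(toy) +
  geom_line(aes(x = year, y = incidence)) +
  # x = -Inf and y = Inf pin the label to the top-left corner of each panel;
  # hjust and vjust nudge it inside the panel border
  geom_text(data = toy_labels, aes(label = country),
            x = -Inf, y = Inf, hjust = -0.1, vjust = 1.5) +
  facet_wrap(facets = vars(country)) +
  theme(strip.text = element_blank())
p
```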

Arrange countries by the change in incidence from 2000 to 2022

# Create a new variable called delta_incidence that is the difference 
# between the last and first incidence values for each country. 
# Then arrange the data by this variable and turn country into a factor
# with levels in the order that they appear in the data:

mutate(country = factor(country) %>% fct_inorder())
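The steps in the comment can be sketched end to end as follows, on made-up toy values. The name toy_ordered is illustrative, and the sketch assumes the first and last years are found by sorting on year within each country.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Made-up values standing in for malaria_countries_ssafrica
toy <- data.frame(
  country = rep(c("Country A", "Country B", "Country C"), each = 3),
  year = rep(c(2000, 2011, 2022), times = 3),
  incidence = c(120, 150, 90, 400, 300, 250, 30, 20, 45)
)

toy_ordered <- toy %>%
  arrange(country, year) %>%
  group_by(country) %>%
  # Difference between the last and first incidence values per country
  mutate(delta_incidence = last(incidence) - first(incidence)) %>%
  ungroup() %>%
  # Largest decrease first
  arrange(delta_incidence) %>%
  # Factor levels follow the order in which countries now appear
  mutate(country = factor(country) %>% fct_inorder())

p <- ggplot(toy_ordered) +
  geom_line(aes(x = year, y = incidence)) +
  facet_wrap(facets = vars(country))
p
```

facet_wrap() lays out panels in factor-level order, so the country with the largest decrease comes first.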

Anchor all panels with the Sub-Saharan-wide trajectory

# I've downloaded the data from Canvas into my project folder:
malaria_all_ssafrica <- read_csv("Malaria/malaria_other_long.csv") %>%
  filter(name == "Sub-Saharan Africa")
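A sketch of how a region-wide series can anchor every panel, using made-up toy values. The key assumption is that the anchor data (toy_region, a stand-in for malaria_all_ssafrica) has no country column: a layer whose data lacks the faceting variable is drawn in every panel.

```r
library(ggplot2)

# Made-up values standing in for malaria_countries_ssafrica
toy <- data.frame(
  country = rep(c("Country A", "Country B", "Country C"), each = 3),
  year = rep(c(2000, 2011, 2022), times = 3),
  incidence = c(120, 150, 90, 400, 300, 250, 30, 20, 45)
)

# Made-up region-wide trajectory; note there is no country column
toy_region <- data.frame(
  year = c(2000, 2011, 2022),
  incidence = c(300, 250, 220)
)

p <- ggplot(toy) +
  # Anchor layer: repeated in every panel because it lacks the faceting variable
  geom_line(data = toy_region, aes(x = year, y = incidence),
            color = "grey60", linetype = "dashed") +
  # Country-specific trajectory
  geom_line(aes(x = year, y = incidence)) +
  facet_wrap(facets = vars(country))
p
```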

An example with varying y-axes

# Look up the `scales` argument in the `facet_wrap` documentation. Pull
# up the documentation with this:
?facet_wrap

Think about the trade-off in letting the y-axes vary: panels with varying y-axes are essentially many individual plots. Varying y-axes make it easier to evaluate differences within a panel but harder to compare between panels.
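A small sketch of the two choices on made-up toy values, using the scales argument of facet_wrap(). The names p_fixed and p_free are illustrative.

```r
library(ggplot2)

# Made-up values standing in for malaria_countries_ssafrica
toy <- data.frame(
  country = rep(c("Country A", "Country B", "Country C"), each = 3),
  year = rep(c(2000, 2011, 2022), times = 3),
  incidence = c(120, 150, 90, 400, 300, 250, 30, 20, 45)
)

base <- ggplot(toy) +
  geom_line(aes(x = year, y = incidence))

# Default: every panel shares the same y-axis
p_fixed <- base + facet_wrap(facets = vars(country))

# scales = "free_y": each panel gets its own y-axis range
p_free <- base + facet_wrap(facets = vars(country), scales = "free_y")

p_fixed
p_free
```

In p_fixed the low-incidence country looks nearly flat next to the others; in p_free its own trend is visible, but the heights of lines in different panels are no longer comparable.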

Unemployment in USA

Jorge Camoes, https://excelcharts.com/charts-monthly-unemployment-rates-by-state-1976-2009/

Facets wrapped by US state, ordered by size of workforce. Easy to compare trends across states. The unemployment band provides an anchor.

Life expectancy in USA

Anna Maria Barry-Jester, https://fivethirtyeight.com/features/as-u-s-life-expectancies-climb-people-in-a-few-places-are-dying-younger/

Sort of a facet grid by categorized latitude and longitude. The trend line provides an anchor. The amount of purple and orange allow to immediately determine

Code Together Task

No Spice: Make the standard faceted plot on slide 15

Weak Sauce: Make the faceted plot without strips on slide 16; make the faceted plot with varying y-axes on slide 20

Medium Spice: Make the faceted plot with inside labels on slide 17

Yoga Flame: Make the faceted plot with reordered facets on slide 18

Dim Mak: Make the faceted plot with the Sub-Saharan Africa trajectory on slide 19

References

Becker, R.A., Cleveland, W.S. and Shyu, M.J., 1996. The visual design and control of trellis display. Journal of Computational and Graphical Statistics, 5(2), pp.123-155.

Tufte, E.R., 1991. Envisioning information. Optometry and Vision Science, 68(4), pp.322-324.

Wickham, H., 2016. ggplot2: Elegant graphics for data analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.

Wilke, C.O., 2019. Fundamentals of data visualization: a primer on making informative and compelling figures. O’Reilly Media.

Yunis, J.J. and Prakash, O., 1982. The origin of man: a chromosomal pictorial legacy. Science, 215(4539), pp.1525-1530.